Transcribing human-directed speech for spoken language processing
نویسنده
چکیده
As storage costs drop and bandwidth increases, there has been a rapid growth of spoken information available via the web or in online archives, raising problems of document retrieval, information extraction, summarization and translation for spoken language. While there is a long tradition of research in these technologies for text, new challenges arise when moving from written to spoken language. In this talk, we look at differences between speech and text, and how we can leverage the information in the speech signal beyond the words to provide structural information in a rich, automatically generated transcript that better serves language processing applications. In particular, we look at three interrelated types of structure (orthographic, prosodic, and syntactic), methods for automatic detection, the benefit of optimizing rich transcription for the target language processing task, and the impact of this structural information in tasks such as information extraction, translation, and summarization.
منابع مشابه
On the difficulty of a distributional semantics of spoken language
The bulk of research in the area of speech processing concerns itself with supervised approaches to transcribing spoken language into text. In the domain of unsupervised learning most work on speech has focused on discovering relatively low level constructs such as phoneme inventories or word-like units. This is in contrast to research on written language, where there is a large body of work on...
متن کاملClassification-based spoken text selection for LVCSR language modeling
Large vocabulary continuous speech recognition (LVCSR) has naturally been demanded for transcribing daily conversations, while developing spoken text data to train LVCSR is costly and time-consuming. In this paper, we propose a classification-based method to automatically select social media data for constructing a spoken-style language model in LVCSR. Three classification techniques, SVM, CRF,...
متن کاملTranscribing Speech: Errors in Corpora and Experimental Settings
Administrations, government organs, judiciary courts always faced the problem of defining limits in transcription practices. Nowadays corpus linguistics and computational linguistics have focused their attention on spoken corpora as indispensable tools for descriptive linguistics, as well as for applied purposes (in speech technologies, such as text-to-speech and speech recognition, in dialogue...
متن کاملمقایسه روش های طیفی برای شناسایی زبان گفتاری
Identifying spoken language automatically is to identify a language from the speech signal. Language identification systems can be divided into two categories, spectral-based methods and phonetic-based methods. In the former, short-time characteristics of speech spectrum are extracted as a multi-dimensional vector. The statistical model of these features is then obtained for each language. The ...
متن کاملCrowd-supervised training of spoken language systems
Spoken language systems are often deployed with static speech recognizers. Only rarely are parameters in the underlying language, lexical, or acoustic models updated on-thefly. In the few instances where parameters are learned in an online fashion, developers traditionally resort to unsupervised training techniques, which are known to be inferior to their supervised counterparts. These realitie...
متن کامل